Elastic Resource Provisioning for Batched Stream Processing System in Container Cloud

Authors

  • Song Wu
  • Xingjun Wang
  • Hai Jin
  • Haibao Chen
Abstract

Batched stream processing systems achieve higher throughput than traditional stream processing systems while still providing low-latency guarantees. Recently, such systems have increasingly been deployed in the cloud because they require elasticity and cost efficiency. However, their performance is hard to guarantee in the cloud, because static resource provisioning cannot accommodate stream fluctuation and uneven workload distribution. In this paper, we propose EStream, an elastic batched stream processing system based on Spark Streaming that transparently adjusts the available resources to handle workload fluctuation and uneven workload distribution in a container cloud. Specifically, EStream automatically scales the cluster when it detects resource insufficiency or over-provisioning under workload fluctuation. It also schedules resources within the cluster according to the workload distribution. Experimental results show that, compared to the original Spark Streaming, EStream handles workload fluctuation and uneven distribution transparently and improves resource efficiency.
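The scaling behaviour described in the abstract can be illustrated with a minimal sketch. It is not EStream's actual algorithm, which the abstract does not specify. It assumes a simple threshold rule built on the general stability condition for micro-batch systems such as Spark Streaming: a batch should finish processing within one batch interval. The names `ScalingPolicy` and `decide_containers` and all threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical thresholds; none of these values come from the paper."""
    batch_interval_s: float = 2.0   # length of one micro-batch, in seconds
    scale_out_ratio: float = 0.9    # add a container above this load ratio
    scale_in_ratio: float = 0.5     # remove a container below this load ratio
    min_containers: int = 1
    max_containers: int = 16


def decide_containers(policy, current, recent_processing_times_s):
    """Return the container count to use for the next interval.

    The load ratio is the average batch processing time divided by the
    batch interval. A ratio near or above 1 means batches start to queue
    (resource insufficiency); a low ratio suggests over-provisioning.
    """
    if not recent_processing_times_s:
        return current  # no measurements yet, so keep the cluster as is
    avg = sum(recent_processing_times_s) / len(recent_processing_times_s)
    load = avg / policy.batch_interval_s
    if load > policy.scale_out_ratio:
        return min(current + 1, policy.max_containers)
    if load < policy.scale_in_ratio:
        return max(current - 1, policy.min_containers)
    return current
```

Under these assumed thresholds, calling `decide_containers(ScalingPolicy(), 4, [1.9, 2.1, 2.0])` requests one more container, because the average processing time equals the batch interval. The gap between the two ratios keeps the cluster from oscillating when the load sits near a single threshold.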

Similar resources

Cost-efficient enactment of stream processing topologies

The continuous increase of unbounded streaming data poses several challenges to established data stream processing engines. One of the most important challenges is the cost-efficient enactment of stream processing topologies under changing data volume. These data volumes impose different loads on stream processing systems, whose resource provisioning needs to be continuously updated at runtime. First...

An Elastic Data Stream Processing Ecosystem for Distributed Environments

In the last couple of years, we have observed a trend towards an ever-growing number and volume of data streams. Until now, these data streams originated mainly from social media services, but today the emergence of the Internet of Things (IoT) also contributes to their growth. Besides the growth of the data volume, the IoT also introduces several new challenges, like the geo...

On the Cost-QoE Trade-off for Cloud Media Streaming under Amazon EC2 Pricing Models

Exponential growth of video traffic challenges the current paradigm of streaming large amounts of video content to end users. Cloud computing with support for elastic resource allocation enables cost-effective video streaming that meets desired QoE requirements. We abstract a new theoretical model from real systems for elastic media streaming by introducing a virtual content service provider that rents c...

Maximum Sustainable Throughput Prediction for Large-Scale Data Streaming Systems

In cloud-based stream processing services, the maximum sustainable throughput (MST) is defined as the maximum throughput that a system composed of a fixed number of virtual machines (VMs) can ingest indefinitely. If the incoming data rate exceeds the system’s MST, unprocessed data accumulates, eventually making the system inoperable. Thus, it is important for the service provider to keep the MS...

Elastic Allocation of Docker Containers in Cloud Environments

Docker containers wrap up a piece of software together with everything it needs for execution, making it easy to run on any machine. For their execution in the Cloud, we need to identify an elastic set of virtual machines that can accommodate those containers, while considering the diversity of their requirements. In this paper, we briefly describe our formulation of the Elastic provis...

Publication date: 2017